ABSTRACT. This paper reviews the current status of measurements of
galaxy clustering at high redshifts (z ≳ 0.3). The focus is on the inherent
limitations in the observation and interpretation of the "evolution of cluster- 
ing". It is likely that results from the first attempts to characterize galaxy 
clustering beyond the "local" universe have been significantly limited by sam- 
ple variance, as the difficulty in assembling large samples over large volumes 
is exacerbated as the observations become more challenging. It is also argued 
that, because of the complicated relationship between galaxies and mass (i.e., 
bias), and the surprising degeneracies among different popular cosmological 
models, it is likely that studies of galaxy clustering as a function of cosmic 
epoch will never be useful for strong discrimination between different cos- 
mological models. On the other hand, observations of galaxy clustering are 
capable of testing basic ideas about how (and where) galaxies form. Galaxy 
formation, as opposed to cosmography, will probably remain a fundamental 
question even beyond the MAP and Planck era. 



1 Introduction 

We are clearly living in an era in 
which the relatively nearby universe 
will be mapped out with exquisite 
precision with new dedicated tele- 
scopes and instruments. The aims 
of such surveys seem clear: to use 
galaxies as a means to map out the 
large-scale structure of the universe,
in the hope that in so doing one can 
understand the details of the rela- 
tionship between observable galax- 
ies and the overall matter distribu- 
tion, and ultimately, to test theories 
of structure formation and to mea- 
sure the details of the power spec- 
trum of mass fluctuations on scales 
that are capable of testing theoret- 
ical ideas about the origins of the 
fluctuations. Even at "zero" redshift, 
as we have heard as a major theme 
at this conference, there is consider- 
able argument about the degree to 
which galaxies should be trusted as 
tracers of mass; in particular, we now 
know that the clustering properties 
of galaxies are not universal, but de- 
pend on galaxy color, luminosity, and 
other messy astrophysical properties 
that are correlated with, but not di- 
rect proxies for, mass. The days of 
treating all galaxies in a redshift sur- 
vey as identical test particles are al- 
most certainly over; the degree to 



which one must worry about things 
like population mixes and luminosity
and color segregation depends on
the nature and scale of the measurement
being made. The point here is
that to understand large-scale
structure as traced by galaxies, one needs
also to understand something about 
the galaxies themselves. This is ei- 
ther annoying, or incredibly interest- 
ing, depending on one's perspective. 

For redshifts where the evolving 
properties of galaxies begin to be im- 
portant, and where the quantity of 
available information drops off pre- 
cipitously, we are still in very much 
an exploratory phase. At z ≳ 0.3,
there has been a great deal of
progress, but we are happy enough
to have any measurements at all;
the kind of careful scrutiny to which 
the current and future major "local" 
redshift surveys have been subjected 
has not yet descended. This rela- 
tively information-starved regime is 
the subject of this short review. 

Now that observations of large 
samples of high redshift galaxies 
are feasible from a purely techni- 
cal standpoint, it is worth revisiting 
the question of just what one learns 
by studying the evolution of cluster- 
ing with redshift. It is also worth 
considering in some detail how the 
selection of galaxies, and the over- 



C. C. Steidel



all design of a survey, may signif- 
icantly influence results. The orga- 
nization of this review is to first 
discuss the status of observations 
of galaxy clustering at z ≲ 1; the
observations here range from traditional
apparent-magnitude selected
redshift surveys designed primarily
for studying galaxy evolution, to
new wide-angle imaging surveys and
applications of photometric redshift
techniques. This will be followed by
a discussion of results and prospects
using large photometric and spectroscopic
surveys at z ≳ 3.

2 "Evolution" of the Correlation Function

Because of the relatively small 
numbers of galaxies in most of the 
high redshift samples, simple statistics
are generally used to describe the
overall level of clustering. Many have 
described the clustering in terms of 
the Groth & Peebles (1977) param- 
eterization of the two-point correla- 
tion function, 

    \xi(r,z) = \left(\frac{r}{r_0}\right)^{\gamma} (1+z)^{-(3+\epsilon+\gamma)} \qquad (1)

where r is the co-moving coordinate
distance, r_0 is the co-moving correlation
length, γ is the slope of the correlation
function and ε is the "evolutionary
parameter". In the context
of this parameterization, with γ =
-1.8, one obtains the usual limiting
cases of ε = -1.2 for clustering
fixed in co-moving coordinates
(i.e., r_0(z) = constant), ε = 0 for
clustering fixed in physical (proper)
coordinates (r_0 ∝ (1+z)^{-2/3}), and
ε = 0.8 for linear growth of
clustering in an Einstein-de Sitter
universe [r_0(z) ∝ (1+z)^{-1},
approximately]. This would be a very good
way of thinking about the evolution 
of clustering if it were the case that 
1) one were seeing the same galax- 
ies at all redshifts z, and 2) if galaxy 
clustering were a monotonic function 
of scale factor. In practice, 1) is al- 
most certainly not the case, since 
selection effects and galaxy evolu- 
tion conspire to bring different types 
(masses?) of galaxies into and out of 
samples as a function of redshift, and 
quite probably 2) is also not satisfied. 
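
As a rough illustration of the limiting cases just listed, the co-moving correlation length implied by a given ε can be evaluated directly. The sketch below assumes the co-moving form ξ(r,z) = (r/r_0)^γ (1+z)^{-(3+ε+γ)} with γ negative, which reproduces the three cases quoted in the text. The function name and the local correlation length of 5 h^-1 Mpc are illustrative choices, not values taken from any of the surveys discussed here.

```python
def comoving_r0(z, r0_local, epsilon, gamma=-1.8):
    """Co-moving correlation length at redshift z, in the units of r0_local.

    Assumes xi(r, z) = (r / r0)^gamma * (1 + z)^-(3 + epsilon + gamma),
    with gamma negative, so that r0(z) = r0 * (1 + z)^((3 + epsilon + gamma) / gamma).
    """
    exponent = (3.0 + epsilon + gamma) / gamma
    return r0_local * (1.0 + z) ** exponent


if __name__ == "__main__":
    r0_local = 5.0  # hypothetical local value in h^-1 Mpc, for illustration only
    for label, eps in [("fixed in co-moving coordinates", -1.2),
                       ("fixed in proper coordinates", 0.0),
                       ("linear growth, Einstein-de Sitter", 0.8)]:
        print(label, round(comoving_r0(1.0, r0_local, eps), 2))
```

Because the exponent depends on ε only through (3 + ε + γ)/γ, a single fitted ε amounts to assuming a pure power law in (1+z) for r_0(z), which is exactly the restriction questioned in the remainder of this section.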



As has been discussed by several 
others at this conference (e.g., Pea- 
cock in these proceedings), a generic 
result of simulations and/or analytic 
models is that the correlation func- 
tion for dark matter halos actually 
evolves very differently from the cor- 
relation function of the mass, and 
actually passes through a minimum 
at intermediate redshifts after having 
begun in a highly biased state when 
fluctuations of the particular mass 
threshold were very rare (see, e.g., 
Brainerd & Villumsen 1994; Bagla 
1998b). In this picture, the evolution 
of the bias of the dark matter ha- 
los out-paces the growth of matter 
fluctuations (so that the galaxy clustering
becomes weaker in co-moving
units with decreasing redshift) until
a characteristic redshift at which the
mass threshold is a ~1σ fluctuation,
after which the clustering strength
increases again. For halos typical of
bright galaxies today, ~10^12 M_⊙,
this clustering minimum is expected 
to occur near z ~ 1. Thus, under 
this picture, the observed evolution 
of clustering would depend upon the 
characteristic halo mass (or mix of 
mass scales) traced by galaxies sat- 
isfying the particular sample selec- 
tion criteria at each redshift. Even 
if one could somehow isolate galax- 
ies of fixed mass, the evolution of 
the clustering would not fit in well 
with the ε parameterization shown
above. In the real world, one is likely 
to be observing a complicated mix 
of galaxies/mass scales that is likely 
to be changing as a function of red- 
shift, so that what one observes in 
a sample is a superposition of such 
complicated evolutionary sequences. 
The prediction would then be that 
describing the evolution of clustering
with a single value of ε that results
from a "best-fit" to a heterogeneous
sample (whether it be heterogeneous
with respect to color, luminosity,
etc.) is not likely to provide
information that is particularly useful
in understanding what is going
on. Many have now suggested dispensing
with the ε parameterization
altogether; this is an excellent suggestion.
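
The non-monotonic behavior described above can be sketched with a toy calculation. The block below is only an illustration under stated assumptions: it uses the standard peak-background-split bias b = 1 + (ν² - 1)/δ_c (Mo & White 1996), which is not taken from this review, together with Einstein-de Sitter linear growth D = 1/(1+z). The present-day rms fluctuation `sigma0` on the halo mass scale is a free, hypothetical input, and the function names are illustrative.

```python
DELTA_C = 1.686  # linear collapse threshold from spherical collapse


def halo_amplitude(z, sigma0):
    """Relative halo clustering amplitude b(z) * D(z) at fixed halo mass.

    sigma0 is the present-day rms linear fluctuation on the chosen mass
    scale (an illustrative input). Einstein-de Sitter growth is assumed.
    The halo correlation function scales as the square of this quantity.
    """
    growth = 1.0 / (1.0 + z)              # D(z) for Einstein-de Sitter
    nu = DELTA_C / (sigma0 * growth)      # peak height of the mass threshold
    bias = 1.0 + (nu ** 2 - 1.0) / DELTA_C
    return bias * growth


def redshift_of_minimum(sigma0, z_max=6.0, steps=6000):
    """Grid search for the redshift at which b(z) * D(z) is smallest."""
    grid = [z_max * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda z: halo_amplitude(z, sigma0))


if __name__ == "__main__":
    z_min = redshift_of_minimum(4.0)
    nu_min = DELTA_C * (1.0 + z_min) / 4.0
    print("minimum near z =", round(z_min, 2), "where nu =", round(nu_min, 2))
```

In this toy model the minimum always falls where ν is a little below unity, independent of `sigma0`, which is the sense in which the turnaround occurs when the mass threshold is roughly a 1σ fluctuation. The redshift of the minimum itself depends entirely on the assumed `sigma0`.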

3 Clustering at z ≲ 1

Until very recently, the only in- 
formation on the clustering of dis- 
tant galaxies that was available was 
based on studies of the angular corre- 
lation function of faint galaxies (e.g., 
Efstathiou et al. 1991, Efstathiou 
1995, Brainerd et al. 1995, Postman 
et al. 1998). The general result was 
much weaker clustering amplitude 
for faint galaxies than seen in local 
galaxy surveys, but because of the 
unknown extent of projection effects 
due to a lack of detailed knowledge 
of the redshift distribution N(z) [in
some cases the clustering results were 
used to place constraints on the be- 
havior of N(z)] and the uncertainty 
that the galaxies seen at faint mag- 
nitudes are the same objects counted 
in local galaxy surveys, the impli- 
cations were somewhat ambiguous. 
A tendency was seen in several of 
the above surveys for a flattening in 
the angular correlation function am- 
plitude at the faintest magnitudes; 
however, the degeneracy between the 
issue of the redshift distribution and 
the evolution of the clustering does 
not permit the solution for one with- 
out knowledge of the other. This de- 
generacy/projection problem in the 
imaging surveys can be overcome, to 
some extent, by using multiple colors 
to assign photometric redshifts, as 
discussed by A. Connolly at this con- 
ference. While there are some limi- 
tations to the photometric redshift 
technique, it appears to work very 
well for z < 1 (e.g., Hogg et al. 1998 
and references therein). 

A first attempt at using a deep 
spectroscopic survey for measuring 
the spatial correlation function of 
faint galaxies (in this case at a median
redshift of z = 0.16) was made
by Cole et al. (1994), who obtained 
a correlation function that was in- 
distinguishable, in co-moving units, 
from the local correlation function. 
There have been a number of sub- 
sequent spectroscopic surveys which 
have addressed galaxy clustering at 
z ≳ 0.3, many of which are summarized
in Table 1. This summary includes
minimal details describing the
results on the co-moving correlation
lengths, in particular, from the various
surveys. There are other large
spectroscopic surveys which reach
similar redshift depths whose results



were in preparation at the time of 
this meeting (e.g., Cohen et al. 1999, 
Small, Sargent, & Ma 1999) and 
which have not yet reported actual 
measured correlation functions in 
the literature. However, the Caltech
Deep Redshift Survey (a K-selected
sample) of Cohen et al. (1999) has 
reported a strong tendency for ob- 
jects that are red in their opti- 
cal/IR colors and have absorption- 
dominated spectra to preferentially 
inhabit the most prominent struc- 
tures in redshift space; the kinemat- 
ics of these structures suggest that 
they are groups or poor clusters. 
There is clear evidence for both lu- 
minosity and color segregation in the 
clustering properties, but the quan- 
titative comparison with the surveys 
presented in Table 1 is not yet possi- 
ble. 

A benchmark piece of work in 
terms of measuring the actual evo- 
lution of the correlation function as 
a function of redshift, within the 
same survey, was that of Le Fevre 
et al. (1996) from the Canada-France 
Redshift Survey (CFRS). As can be 
seen in Table 1, the CFRS team had 
a large enough sample, over a large 
enough redshift range, that the data
could be binned into redshift subsets.
The measurements of the correlation 
lengths versus redshift, and of the 
sample as a whole, seemed to show 
strong evolution of the galaxy corre- 
lation function in the sense that the 
correlation strength was significantly 
weaker in the past; it was argued that 
it was very unlikely that they were
seeing different galaxies as a func- 
tion of redshift within the sample, 
so that for the first time one could 
see the growth of clustering of galax- 
ies directly. Interestingly, the CFRS 
saw no evidence for color segrega- 
tion (based on optical colors) of the 
clustering of galaxies for redshifts beyond
z ~ 0.3.

However, a glance at Table 1 
makes one worry slightly. Compare, 
for example, the preliminary CNOC-2
results at z ~ 0.35 with the lowest
redshift bin of the CFRS survey, or
with the results of the Hawaii Deep 
Survey K-band sample. The CNOC- 
2 sample finds significantly stronger 
clustering even for the "faint" sub- 
sample, and much stronger cluster- 
ing for the "bright" sub-sample; the 
Hawaii Deep Survey, albeit a rela- 

Table 1. Non-exhaustive summary of recent galaxy clustering results to z ~ 1.
Correlation lengths are in co-moving units.

Survey               <z>     N_gal     r_0 (h^-1 Mpc)   Comments
CNOC-1               0.35    140       2.7 ± 0.6        1
CFRS                 0.34    186       1.8 ± 0.2        2
                     0.62    196       1.8 ± 0.2        2
                     0.86    130       1.9 ± 0.2        2
(Full CFRS Sample)   0.53    591       2.2 ± 0.1        2
Hawaii Deep          0.34              3.9 ± 0.2        3
                     0.62              3.2 ± 1.1        3
                     0.97              2.8 ± 1.2        3
                     1.39              2.4 ± 1.2        3
Red Sub-sample       0.6     ~100      3.8              3
Blue Sub-sample      0.6     ~150      1.4              3
KPNO I-band          (0.5)   4e5       4.5 ± 0.6        4
CNOC-2 (bright)      0.35    ~1500     5.0 ± 0.2        5
CNOC-2 (faint)       0.35    ~1500     3.6 ± 0.2        5

1) Shepherd et al. (1996); found best-fit ε ~ 1 ± 1.

2) Le Fevre et al. (1996); five 10' fields, to I_AB = 22.5.
Found γ = -1.6, ε ~ -2, and no color segregation for z > 0.3.

3) Carlberg et al. (1997); K-selected sample, total ~250 z's.
Assumes fixed γ, q_0 = 0.1. Clear color segregation.

4) Postman et al. (1998); w(θ) de-projection using CFRS
redshift distribution and magnitude cuts.

5) Carlberg et al. (1998); sample split into "bright" and "faint"
at M_R = -20; find best fit evolutionary model is
r_0(z) ∝ (1+z)^{-0.3 ± 0.2}.



tively small sample, finds clear evi- 
dence for different clustering for red 
and blue sub-samples, and only the 
blue sub-sample exhibits a correla- 
tion length comparable to the CFRS 
results at comparable redshifts. If 
one takes all of the results together, 
there seems to be relatively weak, 
but significant, evolution of cluster- 
ing in the sense that it is slightly 
less strong in the past. But individ- 
ual survey results exhibit a very wide 
scatter, not at all consistent with the
quoted uncertainties on the measurements,
and inferred values of the evolutionary
parameter ε are all over
the map, from slightly negative to 
strongly positive. In the largest spec- 
troscopic sample, CNOC-2, there ap- 
pears to be both luminosity and color 
segregation present, in the sense that 
redder and more luminous galaxies 
are more strongly clustered (as is 
seen in some local galaxy surveys, 
e.g. Loveday et al. 1995). This survey 
does not reach high enough redshifts 
to make detailed comparisons at the 
median redshift of the CFRS sample.
If there are lessons learned from 



the work that has been completed 
recently in the z ≲ 1 regime, it is
that most of the samples have probably
not been large enough to yield
universal results on the galaxy correlation
function: there is a trend
of increasing measured r_0 at a given
redshift with increasing sample size,
a telltale sign that sample variance
has been a problem. The results
of the very large photometric survey
by Postman et al. (1998) perhaps
illustrate the problem best, in
that one can isolate many indepen- 
dent sub-samples that are as large 
as (for example) the CFRS, and it 
is clear that in a sample the size 
of the CFRS there is a quite large 
probability of measuring clustering 
strength that is significantly smaller 
than that observed over very large 
volumes. Using the CFRS photomet- 
ric selection criteria for their angular 
correlation function, and the
CFRS-observed N(z) for de-projecting to
form the real-space correlation func- 
tion, Postman et al. obtain a value 
for the correlation length that is 
more than two times larger, at the 
median redshift, than that obtained
by Le Fevre et al., a difference that
is significant at about the 4σ level
if one takes the error bars at face 
value. On the other hand, the same 
large imaging survey of Postman et
al., while making a measurement
of unprecedented precision in terms 
of sheer reduction in sample vari- 
ance and counting statistics, is un- 
able to distinguish between a num- 
ber of widely different models for 
the growth of clustering; this indicates
the inadequacy of the ε parameterization
in general, and the
need for cutting down on projec- 
tion effects that hamper deep imag- 
ing surveys intended for clustering 
analysis. One very sensible means 
of overcoming this limitation is to 
use multi-bandpass imaging and the 
photometric redshift method to iso- 
late cosmic epochs, galaxy luminos- 
ity classes, and color cuts, explor- 
ing the behavior of the clustering 
as a function of galaxy properties 
in a multi-dimensional manner. In 
doing this, one clearly increases the 
noise in any clustering estimate, but 
it is almost certainly worth sacrific- 
ing precision in favor of information 
content. Most useful of all, although 
rather painstaking in terms of the re- 
sources necessary to obtain the ob- 
servations, are wide angle, deep spec- 
troscopic surveys such as CNOC- 
2 (and future surveys such as that 
planned by the DEEP collaboration;
see Davis & Faber 1998) where in
addition to simple correlation statis- 
tics, precision galaxy redshifts might 
yield more detailed dynamical infor- 
mation that might help relate the 
galaxy clustering to the clustering of 
the dominant dark matter compo- 
nent on small scales. Both large vol- 
umes and accurate redshifts, with a 
good sampling of redshifts using sen- 
sible selection criteria, will be invalu- 
able. 
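
The "about 4σ" figure quoted above can be checked with simple arithmetic using the Table 1 entries for the KPNO I-band survey (4.5 ± 0.6) and the full CFRS sample (2.2 ± 0.1). The sketch below assumes the two errors are independent and Gaussian, which is precisely the assumption that sample variance calls into question; the function name is illustrative.

```python
import math


def discrepancy_sigma(value_a, err_a, value_b, err_b):
    """Difference between two measurements in units of their combined
    1-sigma error, assuming independent Gaussian uncertainties."""
    return abs(value_a - value_b) / math.hypot(err_a, err_b)


if __name__ == "__main__":
    # Correlation lengths in h^-1 Mpc, taken from Table 1.
    n_sigma = discrepancy_sigma(4.5, 0.6, 2.2, 0.1)
    print("ratio of correlation lengths:", round(4.5 / 2.2, 2))
    print("formal discrepancy in sigma:", round(n_sigma, 1))
```

The formal significance is dominated by the larger of the two error bars, so an underestimated uncertainty on the smaller sample would not by itself reduce the quoted discrepancy very much; the point is rather that the smaller-volume measurement may not be representative.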



4 Clustering at z >> 1 

Given that we do not really under- 
stand the details of what is going on 
with galaxy clustering at intermedi- 
ate redshifts, why would one want to 
bother exploring the higher redshift
universe, where the problem of mak- 
ing clear evolutionary connections 
to galaxies at the present is even 



more difficult? The simplest answer 
is that the pure novelty of compiling 
a large sample of high redshift galax- 
ies would be bound to yield some- 
thing interesting; naively, one might 
have thought that one of the clean- 
est possible tests of the idea that 
matter fluctuations grow by gravi- 
tational instability would be to ob- 
tain a "snapshot" of galaxy clus- 
tering at very high redshifts, where 
one might expect the clustering to 
be much weaker if gravity really 
were the dominant factor in produc- 
ing the structure observed in the lo- 
cal universe. Also naively, one might 
have expected that the amount by 
which the clustering should be different
would be rather sensitive to Ω_m,
with smaller differences expected for
lower Ω_m. All of this would be true
if galaxies formed at infinite redshift
and then evolved quiescently to the 
present epoch, acting like conserved 
test particles. As it turns out, the 
clustering is a sensitive test of our 
collective wisdom about how galaxies 
form, but probably a relatively poor 
cosmological discriminant. 

The method used for compiling 
samples of very high redshift galaxies
to date has been almost exclusively
the "Lyman-break" technique,
where one makes use of the
essentially guaranteed "break" in the
spectrum of high redshift star-forming
objects at 912 Å in the rest
frame due to photo-electric absorption
both in the galaxy itself and in
the intergalactic medium. The feature
is so strong that the coarse
spectrophotometry allowed by broad-band
imaging can be used to isolate
particular ranges of redshift that can
be controlled based on the adopted 
filter system. The technique as suc- 
cessfully implemented has been de- 
scribed recently in many places (e.g., 
Steidel, Pettini, & Hamilton 1995, 
Steidel et al. 1996, 1998b, Madau et 
al. 1996), so I will not go into any 
details here. Basically, the method 
is a highly efficient means of select- 
ing a nearly volume-limited sample 
of objects on the basis of their rest-frame
far-UV luminosity. As with
any other galaxy sample, it is im- 
portant to understand what selec- 
tion effects are implicit in the detec- 
tion technique: here, one is quite in- 
sensitive to stellar mass, but sensi- 
tive almost exclusively to unobscured 
high-mass star formation. Extremely
dusty galaxies, or galaxies which
have ceased forming stars prior to
the epoch at which they are observed
(or those going through a quiescent
phase between star formation
episodes) are unlikely to be included. 

The particular implementation of 
the Lyman break technique for which 
the most data have been obtained to 
date selects galaxies in the redshift 
range 2.7 ≲ z ≲ 3.4, as shown in Figure 1.
The primary goal of the survey
is to accumulate a large enough
sample of high redshift galaxies that 
proper statistics on the luminosity 
distribution, spectral properties, red- 
dening, and, most relevant here, their 
large-scale distribution, are possible. 
The survey is in many ways similar 
to the CFRS in design: there are 5 
primary survey regions, typically 9'
by 18' in angular size, comparable
to that of the CFRS, but the actual
transverse co-moving scale is much
larger on account of the much higher
redshift of the survey. For example,
the co-moving size of a CFRS field
at z ~ 0.6 is 4.5 h^-1 Mpc on a side
(inside which the CFRS densely sampled
three 10' by 2' strips), as compared
to 11.6 h^-1 Mpc by 23.3 h^-1 Mpc for
the LBG fields at z ~ 3, for Ω_m =
0.3, Ω_Λ = 0.7. The CFRS depth
along the line of sight, on the other 
hand, is between 2 and 4 times larger 
than in the LBG fields, depending on 
cosmology. Practical matters, mainly 
having to do with the faint apparent 
magnitudes of even the more lumi- 
nous LBGs, limit the sampling den- 
sity for galaxies at z ~ 3 in these 
larger volumes, so that scales much 
smaller than the field size are not 
probed well. This bears significantly 
on what types of clustering statis- 
tics can be measured well using the 
data. Each LBG survey field samples 
an effective co-moving volume of ~
2.2 × 10^4 h^-3 Mpc^3 for an Einstein-de
Sitter model (~8.3 × 10^4 h^-3
Mpc^3 for Ω_m = 0.3 and Ω_Λ = 0.7),
so that the total volume surveyed is
somewhere between 10^5 and 10^6 h^-3
Mpc^3.
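
The transverse field sizes quoted above follow from the co-moving distance to the survey redshift. The sketch below is a minimal check for spatially flat models only, using Simpson's rule for the distance integral. The function names and the number of integration steps are illustrative choices, and the 10' CFRS field width is taken from the note to Table 1.

```python
import math

C_OVER_H0 = 2997.92458  # Hubble distance c/H0 in h^-1 Mpc


def comoving_distance(z, omega_m, omega_lambda, steps=1000):
    """Line-of-sight co-moving distance in h^-1 Mpc.

    Valid for spatially flat models (omega_m + omega_lambda = 1), where it
    also equals the transverse co-moving distance. `steps` must be even.
    """
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_lambda)

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)  # Simpson weights
    return C_OVER_H0 * total * h / 3.0


def transverse_size(arcmin, z, omega_m, omega_lambda):
    """Co-moving transverse size in h^-1 Mpc subtended by an angle in arcmin."""
    angle = math.radians(arcmin / 60.0)
    return angle * comoving_distance(z, omega_m, omega_lambda)


if __name__ == "__main__":
    for label, arcmin, z in [("CFRS field width", 10.0, 0.6),
                             ("LBG field, short side", 9.0, 3.0),
                             ("LBG field, long side", 18.0, 3.0)]:
        print(label, round(transverse_size(arcmin, z, 0.3, 0.7), 1), "h^-1 Mpc")
```

For an open model the transverse distance would need the usual curvature correction, which this sketch deliberately leaves out.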

Within these survey regions, to 
the adopted apparent magnitude 
cutoff of ℛ_AB = 25.5, there are ~
1500 photometrically selected candi- 
dates, and the aim is for a total spec- 
troscopic sample of ~ 750. Although 
the survey is not yet completed, sev- 



eral interim results have already been 
published. Many of the results have 
been discussed elsewhere, e.g. Steidel 
et al. 1998a,b, Adelberger et al. 1998, 
Giavalisco et al. 1998, so the discus- 
sion here will be brief. 

First, as can be seen in Figure 
2, the Lyman break galaxies are 
strongly clustered, and large over- 
densities, or "spikes" in the red- 
shift distribution are evident in each 
survey field. The galaxies within 
these over-densities are not obvi- 
ously concentrated on the plane of 
the sky, and angular correlations of 
the photometric samples of Lyman 
break galaxies have rather poor S/N 
because of the aforementioned low 
surface density and therefore poor 
sampling of the small scales where 
most of the angular correlation sig- 
nal would lie. At present, the most 
robust statistic for evaluating the 
level of clustering for the LBGs is a
"counts-in-cells" analysis. Here one
simply counts the number of objects
in cubical cells of roughly 10 h^-1 Mpc
on a side (the scale being defined
by the transverse size of the survey
field, which of course varies as a function
of cosmology) in the spectroscopic
sample, corrects for the shot-noise
contribution to the variance
and modest redshift-space distortions,
and evaluates σ_cell (see Adelberger
et al. 1998). The variance in
galaxy counts (relative to the overall
expectation value from the selection
function) in cells of this size is very
closely related to the commonly used
"σ_8" statistic for normalizing
mass fluctuations using cluster abundances
(see, e.g., White et al. 1993,
Eke, Cole, & Frenk 1996). Since it is 
straightforward to compute the ex- 
pected mass fluctuations on the same 
scales at z ~ 3 (using present-day 
cluster normalization) for a given 
cosmology, one can essentially "read 
off" the required galaxy effective bias 
on the scale of a "cell" from a plot 
similar to the ones shown in Figure 
2. The most recent numbers result- 
ing from such an analysis of the cur- 
rent LBG survey sample, assuming 
cluster normalization of Eke, Cole, & 
Frenk (1996) are 

    b_{\rm eff} = \begin{cases} 5.2 \pm 0.9 & \Omega_m = 1 \\ 3.6 \pm 0.6 & \Omega_m = 0.3,\ \Omega_\Lambda = 0.7 \\ 1.7 \pm 0.4 & \Omega_m = 0.2 \end{cases}

where b_eff is the effective linear bias
on scales of ~10 h^-1 Mpc.

[Figure 1 panel: "Confirmed Lyman-break Galaxies"; 624 galaxies, z > 2.2, as of Nov 98; horizontal axis: Redshift.]

Figure 1. Redshift histogram of Lyman break galaxies in the z ~ 3 sample,
selected using color criteria in the U_n G ℛ color-color plane from wide-field
ground-based images. All of the confirming spectroscopic redshifts were
obtained with the Keck telescopes and the Low Resolution Imaging Spectrograph
(Oke et al. 1995).
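
The counts-in-cells estimate described above can be sketched in a few lines. This is a minimal illustration, not the procedure of Adelberger et al. (1998): it uses the standard moment estimator for Poisson-sampled counts, in which a cell with expected count μ has variance μ + μ²σ², and it omits the redshift-space distortion correction mentioned in the text. The function names and the example numbers are hypothetical.

```python
import math


def cell_variance(counts, expected):
    """Shot-noise-corrected variance of the galaxy overdensity in cells.

    counts   : observed number of galaxies in each cell
    expected : number expected in each cell from the selection function
    For Poisson sampling, <(N - mu)^2> = mu + mu^2 * sigma^2, so each cell
    contributes ((N - mu)^2 - mu) / mu^2 to the estimate of sigma^2.
    """
    terms = [((n - mu) ** 2 - mu) / mu ** 2 for n, mu in zip(counts, expected)]
    return sum(terms) / len(terms)


def effective_bias(sigma_gal, sigma_mass):
    """Ratio of galaxy to mass fluctuations on the same cell scale."""
    return sigma_gal / sigma_mass


if __name__ == "__main__":
    counts = [0, 10, 3, 7]             # made-up cell counts
    expected = [5.0, 5.0, 5.0, 5.0]    # made-up selection-function expectations
    variance = cell_variance(counts, expected)
    if variance > 0:
        sigma_gal = math.sqrt(variance)
        # 0.25 is a placeholder for the mass fluctuation predicted by a model
        print("b_eff =", round(effective_bias(sigma_gal, 0.25), 2))
```

With only a handful of cells per field the estimate is noisy and can even come out negative, which is one reason why the quoted bias values carry substantial uncertainties.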

Thus, it can be seen that bright 
LBGs must be strongly biased trac- 
ers of mass fluctuations in order to 
be accomodated easily by standard 
hierarchical models. It is somewhat 
more difficult, given the nature of 
the data, to turn this into a corre- 
lation function (although, of course, 
b e fj is equivalent to an integral over 
the correlation function over the cell 
volume)-while we have attempted 
to de-project the angular correla- 
tion function of LBGs to obtain a 
real-space correlation function (Gi- 
avalisco et al. 1998), the current sam- 
ple is not well-suited to measuring 
w(8) nor w p (8) accurately. The cen- 
tral problem is that the depth of one 
of our survey fields (in co-moving 
units) far exceeds its width, and as 
a result the vast majority of angular 
pairs are simply chance projections 
of galaxies at very different redshifts. 
For example, even at separations as 
small as 20" approximately 90% of 
galaxy pairs in our sample are chance 
projections (Giavalisco et al. 1998). 
Including these chance pairs in a 
clustering analysis, as is required
for w(θ) and some forms of w_p,
results in a disastrous reduction in



signal-to-noise ratio. This seems an
unnecessarily high price for avoiding 
peculiar velocity distortions, which 
are after all relatively minor on large 
scales at these redshifts. Measuring 
w(θ) is a reasonable approach when
the sample is large enough that ran- 
dom fluctuations in the number of 
chance pairs are small compared to 
the number of true pairs, and is of- 
ten the only approach when only a 
small fraction of the galaxies in the 
sample have redshifts; however, nei- 
ther condition is true for the z ~ 3 
LBG sample. 

Thus far we have been quoting a 
number for the correlation lengths 
that is based on assuming a value 
for the slope γ of the correlation
function (Adelberger et al. 1998),
and these numbers range from 4 to 6 h^-1
Mpc, with the lower end of the range
applying to Einstein-de Sitter and
the upper end to low-Ω_m models.
We plan to do a more careful
job with this when the spectroscopic
sample has been completed,
which we anticipate will be quite
soon. Regardless of the precise values
for r_0, the clustering of the z ~ 3
LBGs is as strong as, or stronger than,
most local galaxy samples, and sig- 
nificantly stronger than most of the 

Figure 2. Redshift histograms in individual fields of the z ~ 3 Lyman break
galaxy survey. Each field is 150-225 square arc minutes in size. The light
smooth histogram in each case is the overall redshift selection function for
the survey, normalized to contain the same number of galaxies as observed in
each field. Note the presence of strong "spikes" in the redshift distribution,
and a few significant "voids" as well. Each bin encompasses a volume on the
order of 1000 h^-3 Mpc^3.



intermediate redshift numbers. It is 
still unclear whether we have sam- 
pled enough volume to asymptote 
to the "truth" , but these correlation 
lengths are likely to be firm lower 
limits. 

Strong clustering/bias is expected 
theoretically for rare peaks in the 
density field (Kaiser 1984), and many 
theoretical papers, both predating 
and interpreting the clustering obser- 
vations, can easily explain the strong 
clustering of the LBGs through this 
"high peaks" biasing (e.g., Baugh et 
al. 1998, Coles et al. 1998, Wech- 
sler et al. 1998, Bagla 1998a, Mo and 
Fukugita 1996, Jing & Suto 1998, 
Governato et al. 1998, Mo, Mao, 
& White 1998, Katz, Hernquist, & 
Weinberg 1998). One can get quite 
good agreement with both the abun- 
dance and clustering properties of 
dark matter halos (using either N- 
body or analytic techniques) and the 
real galaxies provided that the typi- 
cal LBG in the sample is associated 
with a dark matter halo mass scale of 
~10^12 M_⊙ (Steidel et al. 1998a,b;



Adelberger et al. 1998; Mo, Mao, 
& White 1998). This good agree- 
ment suggests that there should be a 
monotonic relationship between dark 
matter halo mass and UV luminos- 
ity, and that most, if not all, dark 
matter halos of a given mass con- 
tain a LBG exhibiting a star forma- 
tion rate with relatively small scatter 
(Adelberger et al. 1998). This pro- 
vides empirical evidence that the use 
of star formation prescriptions that 
are based on the dark matter halo 
properties (as in most semi-analytic 
models of galaxy formation) may be 
on the right track. 

A power spectrum shape parameter
of Γ ~ 0.2 is most consistent
in matching the inferred bias and 
the abundance of galaxies and dark 
matter halos, but otherwise surpris- 
ingly little difference is expected for 
the clustering of objects of a given 
abundance among the currently pop- 
ular dark matter models which have 
this kind of shape parameter (i.e., 
τCDM, open CDM, ΛCDM). As discussed
by Steidel et al. 1998b, Adelberger
et al. 1998, and Giavalisco et
al. 1999, a more stringent test of 
such a simple association of LBGs 
with dark matter halos in a hi- 
erarchical model would come from 
examining the clustering of much 
fainter LBGs, which would presum- 
ably trace much smaller mass dark 
matter halos which should be sig- 
nificantly less clustered at high red- 
shift. Preliminary indications show 
that most of the models remain con- 
sistent with the data when faint LBG 
samples from the HDF are com- 
pared with the ground based results, 
with significantly smaller correlation 
lengths for objects with abundances 
~ 20 times larger than the bright 
galaxies in the ground-based survey. 
Larger samples, particularly of the 
faint objects, will be required in or- 
der to be able to exert much pressure 
on any of the currently popular dark 
matter models. 

On the other hand, there is a very 
substantial difference among the var- 
ious models for the masses of ob- 
jects of a given abundance and clus- 
tering level. This difference is large 
enough (a difference of a factor of 
~ 3 in circular velocity) between low 
Ω_m models and τCDM that observations
of line widths (even with all
of the inherent uncertainties in using
them for dynamical mass estimates)
may be able to resolve the degeneracy.
The main problem is that line
widths are essentially always providing
lower limits on v_c, and some theoretical
predictions suggest (e.g., Mo,
Mao, & White 1998) that the observed
line widths may not be radically
different despite the very different
v_c because of the fact that
the highest star formation efficiency 
would occupy regions that are still 
on the rising part of the rotation 
curve. Some observations along these 
lines have already been attempted 
using the familiar nebular lines in the 
rest-frame optical (observed near-IR)
(Pettini et al. 1998), but the
advent of IR spectrographs on 8-10m
telescopes should result in a huge
amount of progress in this area.

The UV spectra of LBGs, on the 
other hand, represent possibly the 
most frustrating limitation of the 
spectroscopic samples. While in principle
the redshift accuracy achievable
at z ~ 3 is the same as one
could achieve at intermediate redshift,
the problem is that essentially
none of the commonly-observed far-UV
lines is trustworthy as an indication
of the systemic redshift of the
galaxy. We have estimated the intrin- 
sic uncertainty (independent of mea- 
surement errors) to be on the order 
of ~ 300 km s^-1. This means that it
may be difficult to explore any statis- 
tics based on small-scale dynamics 
(e.g., pairwise velocity dispersions) 
without wholesale IR spectroscopy. 
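
To give a feeling for what a ~ 300 km s^-1 intrinsic uncertainty means in practice, the following minimal sketch converts a rest-frame velocity uncertainty into a redshift uncertainty, Δz = (1 + z) σ_v / c, and into a line-of-sight comoving distance uncertainty, c Δz / H(z). The Einstein-de Sitter expansion rate and the function names are assumptions made purely for illustration; they are not taken from the analysis described in the text, and other cosmologies will give somewhat different distances.

```python
C_KM_S = 299792.458  # speed of light in km/s


def redshift_error(sigma_v_km_s, z):
    """Redshift uncertainty for a rest-frame velocity uncertainty at redshift z."""
    return (1.0 + z) * sigma_v_km_s / C_KM_S


def hubble_rate_eds(z, h=1.0):
    """H(z) in km/s/Mpc, ASSUMING an Einstein-de Sitter model (illustrative only)."""
    return 100.0 * h * (1.0 + z) ** 1.5


def los_comoving_error(sigma_v_km_s, z, h=1.0):
    """Line-of-sight comoving distance uncertainty in Mpc: c * dz / H(z)."""
    return C_KM_S * redshift_error(sigma_v_km_s, z) / hubble_rate_eds(z, h)


if __name__ == "__main__":
    dz = redshift_error(300.0, 3.0)
    dr = los_comoving_error(300.0, 3.0)
    print(f"dz = {dz:.4f}, line-of-sight error = {dr:.2f} h^-1 Mpc (comoving)")
```

Under these assumptions, 300 km s^-1 at z = 3 corresponds to Δz of about 0.004, or about 1.5 h^-1 Mpc (comoving) along the line of sight, which is why statistics that rely on small-scale velocity information are difficult with far-UV redshifts alone.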

4.1 General Implications 

The nature and clustering of the 
z ~ 3 LBGs are very consistent with 
the overall "paradigm" that galax- 
ies would form at the highest, "bi- 
ased" peaks in the dark matter dis- 
tribution at early epochs, and that 
these objects should be strongly clustered
at high redshift. These clustering
properties, together with the observed
space densities, imply that the individual
galaxies are associated with
dark matter halos of the order of
~ 10^12 M_⊙. Within the context of
these models, a large fraction of the 
LBGs seen in the bright ground-based
samples would end up in the richest
environments in the present-day
universe, and the prominent "spikes" 
at z ~ 3 are likely to be the progenitors
of present-day rich galaxy clusters
(e.g., Steidel et al. 1998a, Governato
et al. 1998). The incidence of
these prominent over-densities is in- 
deed broadly consistent with this hy- 
pothesis. Observations of the cluster- 
ing properties as a function of space 
density for LBGs over a wide range 
of luminosities have the potential 
to measure the shape of the power 
spectrum on scales of ~ 1-10 h^-1
Mpc (i.e., between galaxy and cluster 
scales) if the case can be made that 
the UV luminosity really is a good 
proxy for dark matter halo mass. 
Here again the problem is somewhat 
circular, in the sense that the dark 
matter models can be tested rigor- 
ously only if some observable prop- 
erty of the galaxies can be closely 
tied to the mass, but if one knew 
the underlying structure of the dark 
matter, it would be possible to close 
in on how star formation is related 
to the dark matter distribution. At 
present, the assumption that UV lu- 
minosity and dark matter halo mass 
are very closely related, with a power 
spectrum shape very close to that
which works best to explain the local 
large scale structure (see e.g., Pea- 
cock & Dodds 1996), works very well 
indeed, but this solution cannot be 
said to be unique at this point in 
time. 



5 z ~ 4 and Beyond 

At the time of this writing, a 
handful of galaxies have been iden- 
tified beyond z ~ 5. Should one be 
thinking about large surveys beyond 
z ~ 3? As always, the answer de- 
pends on what it is one wants to 
learn. It is conceptually straightfor- 
ward to locate higher redshift galaxy 
candidates using variations on the 
Lyman break technique, particularly 
with good sensitivity in the near-IR
for redshifts beyond z ~ 5 or so. 
However, practical matters will prob- 
ably prevent large and successful sur- 
veys useful for examining large scale 
structure. We have recently com- 
pleted a pilot spectroscopic survey 
for LBGs at redshifts z ≳ 4 (Steidel
et al. 1999), and find that the surface
density of candidate objects to
I_AB = 25.0 is just barely high enough
to take advantage of multiplexing using
imaging spectrographs on 8-10m
class telescopes. Fainter than this,
because of the much brighter background
one must fight to get spectra
of the higher redshift objects
(the features that secure the redshifts
tend to be in the range 1200-1700 Å
in the rest-frame), anything
close to spectroscopic completeness 
would be extremely painful. On the 
other hand, if one can get around 
the more significant contamination 
by interlopers in the z ~ 4 samples 
(~ 20%) it might be possible to use 
de-projection of the angular corre- 
lation function of photometrically- 
selected candidates to quite faint 
magnitudes using the new generation 
of wide-field imagers. The question is 
whether the clustering of similarly- 
selected objects at z ~ 4 is providing 
much additional information over the 
much more easily measured statistics
at z ~ 3. 
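
The effect of interlopers on an angular correlation measurement can be illustrated with a short sketch. If a fraction f of a photometrically selected sample consists of contaminants that are uncorrelated with the real targets, the observed amplitude is diluted to (1 - f)^2 times the true amplitude, plus f^2 times whatever clustering the contaminants have among themselves. The function names below are hypothetical, and the default assumption that the interlopers are themselves unclustered is a simplification made for illustration, not a statement about the real z ~ 4 samples.

```python
def diluted_amplitude(w_true, contamination, w_contaminant=0.0):
    """Observed correlation amplitude for a sample with a contaminating fraction.

    Assumes the contaminants are uncorrelated with the real targets;
    w_contaminant is the contaminants' own clustering amplitude
    (zero by default, which is an illustrative simplification).
    """
    if not 0.0 <= contamination < 1.0:
        raise ValueError("contamination must lie in [0, 1)")
    f = contamination
    return (1.0 - f) ** 2 * w_true + f ** 2 * w_contaminant


def corrected_amplitude(w_obs, contamination, w_contaminant=0.0):
    """Invert diluted_amplitude to estimate the true amplitude."""
    if not 0.0 <= contamination < 1.0:
        raise ValueError("contamination must lie in [0, 1)")
    f = contamination
    return (w_obs - f ** 2 * w_contaminant) / (1.0 - f) ** 2


if __name__ == "__main__":
    # With ~20% unclustered interlopers the observed amplitude is 64% of the true one.
    print(diluted_amplitude(1.0, 0.2))
```

The correction is only as good as the knowledge of the contamination fraction, which is one reason spectroscopic follow-up of at least a subsample remains important even for purely photometric clustering measurements.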

On the other hand, it has been 
possible to compare the properties 
of the galaxies even in modest-sized 
samples at z ~ 4 and z ~ 3 using the 
ground-based surveys. This is a bit
of a diversion from the topic of large-scale
structure in general, but perhaps
serves as an interesting example
of how sample variance can lead to 
somewhat misleading results, and in 
any case allows me to present a new 
result that had just been obtained at 
the time of the meeting in August 
1998. Most readers would be familiar 
with the very exciting results from 
the Hubble Deep Field regarding the 
star formation history of the universe 
as revealed by various galaxy sur- 
veys, with the highest redshift points 
being obtained using Lyman break 
galaxies within the ~ 5 square arc 
minute HDF. The implication from 
the work of Madau et al. (1996) and 
follow-up papers is that the UV lu- 
minosity density of the universe, a 
proxy for the total SFR as a func- 
tion of cosmic epoch, reached a peak 
somewhere in the neighborhood of 
z ~ 2 and declined steadily beyond 
that redshift. Fearing that perhaps 
the HDF might not be a representa- 
tive region of the universe, we have 
compiled a sample covering about 
830 square arc minutes (~ 160 times 
larger area than the HDF) to the relatively
bright magnitude of I_AB = 25.0,
using photometric selection designed
to be as analogous as possible
to the one implemented at z ~ 3, and 
spectroscopic redshifts for about 50 
z ~ 4 galaxies to secure the redshift 
selection function. A comparison of 
the luminosity density represented in 
the bright ends of the z ~ 3 and 
z ~ 4 luminosity functions indicates 
that the luminosity density is essen- 
tially constant in the two redshift 
ranges (see Figure 3), and indications
are that the HDF is under-dense in 
z ~ 4 galaxies relative to a survey 
covering a much larger volume. For 
amusement, I have reproduced a fig- 
ure showing the latest incarnation of 
the "star formation history diagram" 
in Figure 4, showing previous results 
as well as the new points that come 
purely from the large ground-based
surveys. The moral of the story is 
that one can never survey too much 
volume, and one should always be 
concerned about the lumpiness of 
the universe when convincing oneself 
that one is seeing something "univer- 
sal". 

Figure 3. Luminosity functions obtained by combining our ground-based,
wide-area surveys, with data on faint objects in the Hubble Deep Field. The 
HDF points are based on the catalogs of LBGs presented by Dickinson (1998) 
and Madau et al. 1998, but we have reanalyzed the effective survey volumes 
with knowledge of the true color distributions of the LBGs based on our spec- 
troscopic samples. The bright ends of the z ~ 3 and z ~ 4 luminosity func- 
tions are strikingly similar, in both shape and normalization (the z ~ 4 curve 
is simply the fit at z ~ 3 shifted by the distance modulus between z = 3.04 
and z = 4.13, with the normalization multiplied by 0.8). The integral
of UV luminosity to the equivalent of I_AB = 25 in the higher z sample is
within about 20% of that over the same luminosity range at z ~ 3, indepen- 
dent of cosmology. Note the indications that the HDF is underdense in z ~ 4 
galaxies. 
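
The size of the magnitude shift mentioned in the caption of Figure 3 can be estimated with a minimal sketch that computes the difference in distance modulus between two redshifts, 5 log10[D_L(z2) / D_L(z1)], for a flat universe. The caption does not state which cosmology was used, so the Einstein-de Sitter default (omega_m = 1) and the function names here are assumptions for illustration only. The Hubble constant cancels in the ratio, and no k-correction is included.

```python
import math


def dimensionless_lum_distance(z, omega_m=1.0, steps=10000):
    """Luminosity distance in units of c/H0 for a flat model.

    ASSUMES a flat universe with matter density omega_m and a
    cosmological constant of 1 - omega_m; uses midpoint integration.
    """
    omega_l = 1.0 - omega_m
    dz = z / steps
    total = 0.0
    for i in range(steps):
        zm = (i + 0.5) * dz
        total += dz / math.sqrt(omega_m * (1.0 + zm) ** 3 + omega_l)
    return (1.0 + z) * total


def distance_modulus_shift(z1, z2, omega_m=1.0):
    """Difference in distance modulus (magnitudes) between z2 and z1."""
    d1 = dimensionless_lum_distance(z1, omega_m)
    d2 = dimensionless_lum_distance(z2, omega_m)
    return 5.0 * math.log10(d2 / d1)


if __name__ == "__main__":
    for om in (1.0, 0.3):
        shift = distance_modulus_shift(3.04, 4.13, omega_m=om)
        print(f"omega_m = {om}: shift = {shift:.2f} mag")
```

For the Einstein-de Sitter case the shift between z = 3.04 and z = 4.13 comes out at roughly 0.75 magnitudes; the value changes with the assumed cosmology.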



6 Concluding Remarks 

Whereas the evolution of cluster- 
ing of galaxies used to be seen as 
a sure-fire cosmological test, this 
somewhat naive view has at this 
point probably gone the way of all 
other cosmological tests that hoped 
to ignore the vagaries of galaxy evo- 
lution and treat galaxies like test 
particles. If there are any points that 
I would hope to get through in this 
highly qualitative talk (the data do 
not yet justify anything much more 
sophisticated!), it is that one needs to 
cover large volumes in order to hope 
to have reliable estimates of even 
the simplest statistics, one needs to 
worry a great deal about compar- 
ing apples to oranges in comparing 
galaxy samples at different cosmic 
epochs, and one must remain highly 



suspicious of what galaxy cluster- 
ing is telling one about the develop- 
ment of structure. It is essential to 
be able to isolate cosmic epochs to 
avoid being overwhelmed by compli- 
cated projection effects — photomet- 
ric redshifts seem like a very power- 
ful tool for surveying both large vol- 
umes and being able to slice a sur- 
vey in ways that will reveal what 
is really going on. Even better are 
large spectroscopic surveys, in com- 
bination with photometric redshifts. 

In the end, because of complex
epoch-dependent, luminosity-dependent,
type-dependent bias, which is
expected theoretically (and now seen 
observationally) to be worse at high 
redshift, one must accept the fact 
that galaxy clustering may never 
constitute a powerful cosmological 
test, even with all of the fantastic 

Figure 4. A revised version of the "star formation history" diagram, with new
points from the large ground-based surveys indicated with the crosses. The 
circles come from Lilly et al. (1996), the squares from Connolly et al. (1997), 
and the triangles from Madau et al. (1998). Note that with internally consis- 
tent corrections for extinction (see Steidel et al. 1999 for details), there is no 
indication for any significant change in the universal star formation density 
for any z > 1. 



data that will continue to roll in 
over the next several years. On the 
bright side, it seems that our ba- 
sic ideas about how galaxies form 
within halos of dark matter whose 
distribution is easily understood us- 
ing relatively simple statistics or N- 
body simulations are holding up very 
well, and (from an observer's point 
of view, at least) it is very encourag- 
ing that theorists and observers seem 
by and large to be talking about the 
same universe. There is enormous 
potential for progress in the area of 
understanding the interface between 
galaxy formation and structure for- 
mation, and it will involve a lot of 
interaction between theory and ob- 
servations. 

Studying large scale structure by 
using galaxies ultimately involves 
having to understand galaxies them- 
selves, and the problems of structure 
formation and galaxy formation are 
intimately related, and largely insep- 
arable, problems. The very obvious 
galaxy bias seen in the high redshift 
samples (at least, within the context
of generic models in which structure
grows by gravitational instability
from initial Gaussian perturbations)
has perhaps emphasized the
problem that has long been implicit
in the theory: that galaxies are not
to be trusted as reliable tracers of 
mass, and you should trust the young 
ones even less than the older ones. 
On the other hand, as eloquently 
pointed out by Carlos Frenk in his 
closing remarks, it is human nature 
that one's interest in anyone (or any- 
thing) that one understands or trusts 
quickly wanes, whereas the mysteri- 
ous and untrustworthy seem all the 
more attractive, despite one's bet- 
ter judgment. This probably means 
that many cosmologists will be try- 
ing to understand galaxy formation 
after all of the cosmological param- 
eters are sorted out by MAP and 
Planck. 